Linear vs. Quadratic Rules
Abstract 

Both predictive discriminant analysis (PDA) and descriptive 
discriminant analysis (DDA) require a decision to pool group 
covariance matrices, or alternatively to retain separate group 
covariance matrices when the group covariance matrices are too 
dissimilar to pool together. Pooling the group covariance 
matrices invokes the so-called "linear" rule, generally 
preferred in predictive and descriptive analysis. Retaining 
separate group covariance matrices invokes the "quadratic" rule, 
resulting in a higher hit rate in PDA and a lower lambda in DDA. 
However, the quadratic rule is influenced by unique sampling
error variance; therefore, the generalizability of quadratic
rule results is suspect.

Multivariate statistics provide adept researchers with
methods that (a) control experimentwise error rate, and (b) 
honor a complex reality with multiple causes and multiple 
effects (Fish, 1988; Thompson, 1997). Empirical studies
(Emmons, Stallings, & Layne, 1990) bear out the observation that
"in the last 20 years, the use of multivariate statistics has
become commonplace" (Grimm & Yarnold, 1995, p. vii). However, with
discriminant analysis, common use does not ensure responsible 
use. Responsible use of discriminant analysis depends on 
distinguishing between predictive (PDA) and descriptive (DDA) 
discriminant analysis (Huberty, 1994; Huberty & Barton, 1989; 
Huberty & Wisenbaker, 1992). 

Broadly speaking, discriminant analysis either predicts or 
describes group membership. As Huberty and Lowman (1997) have 
noted: "Simply put, we have different analyses (PDA and DDA) for 
different questions; one is for prediction of group membership 
[PDA] and one is for description of grouping variable effects 
[DDA]" (p. 759). Even popular statistical software packages 
such as SAS and SPSS muddle the PDA/DDA distinction, providing 
misleading or incorrect information (Huberty & Lowman, 1997). 

In order to use either PDA or DDA, a researcher must decide 
whether to pool group covariance matrices, or retain separate 
group covariance matrices when the group covariance matrices are 
too dissimilar to pool together. Pooling group covariance
matrices invokes the so-called "linear" rule, generally
preferred in predictive and descriptive analysis. Retaining
separate group covariance matrices invokes the "quadratic" rule,
resulting in a higher hit rate in PDA and a lower lambda in DDA.
However, the quadratic rule is influenced by unique sampling
error variance, making the generalizability of quadratic rule
results suspect (Huberty, 1994). While using separate group
covariance matrices improves PDA and DDA results for an 
individual study, the results are unlikely to replicate in 
future studies. 

Pooling Covariance Matrices 

Pooling group covariance matrices invokes the linear rule.
While linear rule results may be less exciting and more
conservative (i.e., have higher DDA lambdas or lower PDA hit
rates) than quadratic rule results, linear rule results are not
as susceptible to unique sampling error variance. However,
pooling variance "... is legitimate if, and only if, the
variabilities of the scores in each group are roughly the same"
(Haase & Thompson, 1996, p. 6). Assessing whether group
covariance matrices are "roughly the same" is usually
accomplished with statistical significance testing (Huberty &
Lowman, 1997).

In the context of evaluating homogeneity of variance,
Huberty (1994) noted three major problems with statistical
testing procedures: (a) the methods are extremely sensitive "to
relatively small matrix discrepancies;" (b) "the degrees of
freedom used in the chi square and F-tests are quite large even
for modest-sized data sets;" and (c) the statistical tests such
as Box's M are extremely sensitive "to lack of multivariate
normality of the outcome variables vectors" (p. 70).

Essentially, tests for homogeneity of variance are statistically 
powerful, have many, many degrees of freedom, and are influenced 
by lack of multivariate normality. 

Therefore, as in the univariate world, "common sense may be 
the best guide to evaluating whether the homogeneity of variance 
assumption has been met" (Haase & Thompson, 1996, p. 11). A
thoughtful researcher may use log determinant values that 
"provide an indication of which groups' covariance matrices 
differ most" (SPSS, Inc., 1998, p. 263). Furthermore, both SPSS 
and SAS outputs provide log determinants. Also, box plots and 
within group scatterplots may be useful in assessing homogeneity 
of group covariance matrices (SPSS, Inc., 1998). 
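
As an informal illustration of the log determinant comparison, the following minimal sketch computes the log determinant of each group's covariance matrix alongside that of the pooled matrix. The function names and the simulated data are hypothetical and are not drawn from SPSS or SAS; the pooled matrix is assumed to be the usual degrees-of-freedom-weighted average of the group matrices.

```python
import numpy as np

def group_covariances(X, y):
    """Return {label: sample covariance matrix} for each group in y."""
    return {g: np.cov(X[y == g], rowvar=False) for g in np.unique(y)}

def pooled_covariance(X, y):
    """Average the group covariance matrices, weighting each by n_g - 1."""
    labels = np.unique(y)
    weighted = sum((np.sum(y == g) - 1) * np.cov(X[y == g], rowvar=False)
                   for g in labels)
    return weighted / (len(y) - len(labels))

def log_determinants(X, y):
    """Log determinant of each group covariance matrix and of the pooled matrix."""
    out = {g: np.linalg.slogdet(S)[1] for g, S in group_covariances(X, y).items()}
    out["pooled"] = np.linalg.slogdet(pooled_covariance(X, y))[1]
    return out

# Simulated example (assumption): the second group is far more variable.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(30, 2)), 3.0 * rng.normal(size=(30, 2))])
y = np.repeat([0, 1], 30)
for label, value in log_determinants(X, y).items():
    print(label, round(float(value), 3))
```

Group log determinants that sit far from one another, or far from the pooled value, would suggest that pooling deserves a closer look; how far is "too far" remains a judgment call, as the passage above argues.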

Comparing Distances 

Homogeneity of group covariance is critical, as Huberty 
(1994) noted: 

The basic requirement in comparing distances involving 
measures on two (or more) variables is that the same 
metric is used in computing the distances. One way
this is assured is, of course, if all standard
deviations or variances are equal.... If this is not 
the case, the unequal variances must be 'taken into 
consideration.' This is accomplished by dividing the 
measures by the corresponding standard deviation. (p. 
42) 

Comparing distances that are not in the same metric is as 
nonsensical as comparing different currencies. 

For example, three types of distances are investigated in 
classic one-way analysis of variance. One, the distance of 
individual scores from the grand mean is the total sum of 
squares. Two, the distance of group means from the grand mean 
is the sum of squares between groups. And, three, the distance 
of individual scores from the group means pooled together is the 
sum of squares within groups. 
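
The three ANOVA distances can be made concrete with a small sketch. The function name and the nine scores below are invented purely for illustration; the between-groups term is weighted by group size, as in the standard one-way decomposition.

```python
import numpy as np

def anova_sums_of_squares(scores, groups):
    """Return (SS total, SS between, SS within) for a one-way layout."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    grand_mean = scores.mean()
    # Distance of individual scores from the grand mean.
    ss_total = np.sum((scores - grand_mean) ** 2)
    ss_between = 0.0
    ss_within = 0.0
    for g in np.unique(groups):
        s = scores[groups == g]
        # Distance of the group mean from the grand mean, once per score.
        ss_between += len(s) * (s.mean() - grand_mean) ** 2
        # Distance of individual scores from their own group mean, pooled.
        ss_within += np.sum((s - s.mean()) ** 2)
    return ss_total, ss_between, ss_within

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
groups = ["a"] * 3 + ["b"] * 3 + ["c"] * 3
ss_total, ss_between, ss_within = anova_sums_of_squares(scores, groups)
print(ss_total, ss_between, ss_within)  # 60.0 54.0 6.0
```

The within-groups term is where pooling happens: each group's squared deviations are simply added together, which is sensible only when the groups are about equally variable.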

Again, as in predictive and descriptive discriminant
analysis, pooling variance for the sum of squares within groups
is "legitimate if, and only if, the variabilities of the scores
in each group are roughly the same" (Haase & Thompson, 1996, p.
6). However, while in ANOVA pooling variance is not optional,
in PDA and DDA, the linear and quadratic rules empower the
researcher to determine whether or not group covariance matrices
are similar enough to justify implementing the preferred linear
rule.

In multivariate methods such as predictive and descriptive
discriminant analyses, the Mahalanobis distances among 
individual points, distances among group centroids, and 
distances among points and centroids are explored and compared 
(Huberty, 1994). In addition to answering different research 
questions, another distinction between PDA and DDA is the type 
of distance used in the respective analyses. 

Distances Between Points 

Huberty (1994) explained that the Mahalanobis distance 
between points is determined by: 

$$\Delta^2_{AB} = (X_A - X_B)' \, \Sigma^{-1} (X_A - X_B)$$

In this equation, $\Delta^2_{AB}$ is a "squared generalized index between
Point A (defined by the column vector $X_A$) and Point B (defined by
the column vector $X_B$)" (p. 43). The influence of unequal
variance is "taken into consideration by using the inverse of
the population covariance matrix $\Sigma^{-1}$" (Huberty, 1994, p. 43).
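
A minimal sketch of the squared distance between two points follows, assuming the covariance matrix is known. The function name and the numbers are illustrative only.

```python
import numpy as np

def squared_mahalanobis(x_a, x_b, cov):
    """Squared Mahalanobis distance between two points for a given covariance matrix."""
    diff = np.asarray(x_a, dtype=float) - np.asarray(x_b, dtype=float)
    # Solving cov @ z = diff avoids forming the inverse explicitly.
    return float(diff @ np.linalg.solve(cov, diff))

# Hypothetical covariance: variable 1 has variance 4, variable 2 has variance 1.
cov = np.array([[4.0, 0.0],
                [0.0, 1.0]])
origin = [0.0, 0.0]
print(squared_mahalanobis([2.0, 0.0], origin, cov))  # 1.0
print(squared_mahalanobis([0.0, 2.0], origin, cov))  # 4.0
```

Both points lie two raw units from the origin, yet the distances differ because each variable is rescaled by its own variability, which is the sense in which unequal variances are "taken into consideration."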

Distances Among/Between Centroids

The distance among or between group centroids is the 
distance of interest in descriptive discriminant analysis 
(Huberty, 1994). Group separation is determined by: 

$$\Delta^2_{12} = (\mu_1 - \mu_2)' \, \Sigma^{-1} (\mu_1 - \mu_2)$$

In the formula for distances among group centroids, "$\Sigma$ is the
covariance matrix common to the two populations; that is, the
covariance matrices are assumed to be equal" (Huberty, 1994, p.
44). Using the covariance matrix "common to the two
populations" mitigates the influence of unique sample variance.
In descriptive discriminant analysis, "the grouping variable
plays the role of a predictor variable, and the p response
variables are outcome variables" (Huberty & Lowman, 1997, p.
759). From the formula for distance among/between group
centroids, it is evident that the focus of DDA is the distance
between/among group centroids.
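
In sample terms, the centroid distance might be sketched as below. The assumption here is that the common covariance matrix is estimated by pooling the two sample covariance matrices; the function name and the toy data are hypothetical.

```python
import numpy as np

def centroid_distance_sq(X1, X2):
    """Squared Mahalanobis distance between two group centroids,
    using the pooled (common) covariance matrix."""
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    n1, n2 = len(X1), len(X2)
    pooled = ((n1 - 1) * np.cov(X1, rowvar=False)
              + (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    diff = X1.mean(axis=0) - X2.mean(axis=0)
    return float(diff @ np.linalg.solve(pooled, diff))

# Toy data: the second group is the first shifted two units on variable 1,
# so both groups share exactly the same covariance matrix.
group_1 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
group_2 = group_1 + np.array([2.0, 0.0])
print(centroid_distance_sq(group_1, group_2))
```

Because a single pooled matrix is used for both groups, the idiosyncrasies of any one group's sample covariance matrix carry less weight in the result.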

Distances Among Points and Centroids 

The distance among points and centroids is the type of
distance on which "emphasis is given in predictive discriminant 
analysis" (Huberty, 1994, p. 44). Huberty (1994) noted that the 
distance index among individual points and group centroids is 
calculated by: 

$$\Delta^2_{ug} = (X_u - \mu_g)' \, \Sigma_g^{-1} (X_u - \mu_g)$$

"where $\Sigma_g$ is the covariance matrix for population g" (Huberty,
1994, p. 44). In PDA, group membership is the outcome variable,
and the response variables are the predictors (Huberty & Lowman,
1997). From the formula for distance among points and group
centroids, it is evident that the focus of PDA is the distance 
between points and group means. 
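
The contrast between the two rules can be sketched by computing each point-to-centroid distance twice: once with the group's own covariance matrix (quadratic) and once with the pooled matrix (linear). This is a simplified minimum-distance illustration under stated assumptions; a full classification rule would typically also involve prior probabilities and, for the quadratic rule, a log determinant term, both omitted here. All function names are hypothetical.

```python
import numpy as np

def fit_groups(X, y):
    """Estimate group centroids, group covariance matrices, and the pooled matrix."""
    labels = np.unique(y)
    means = {g: X[y == g].mean(axis=0) for g in labels}
    covs = {g: np.cov(X[y == g], rowvar=False) for g in labels}
    dof = {g: np.sum(y == g) - 1 for g in labels}
    pooled = sum(dof[g] * covs[g] for g in labels) / sum(dof.values())
    return means, covs, pooled

def point_to_centroid_sq(x, means, covs, pooled, rule="quadratic"):
    """Squared distance from point x to every group centroid."""
    out = {}
    for g, mu in means.items():
        # Quadratic rule: separate matrix per group. Linear rule: pooled matrix.
        S = covs[g] if rule == "quadratic" else pooled
        d = np.asarray(x, dtype=float) - mu
        out[g] = float(d @ np.linalg.solve(S, d))
    return out

def classify(x, means, covs, pooled, rule="quadratic"):
    """Assign x to the group whose centroid is closest under the chosen rule."""
    d = point_to_centroid_sq(x, means, covs, pooled, rule)
    return min(d, key=d.get)

# Simulated data (assumption): group 1 is much more variable than group 0.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(40, 2)), 4.0 + 3.0 * rng.normal(size=(40, 2))])
y = np.repeat([0, 1], 40)
means, covs, pooled = fit_groups(X, y)
x_new = np.array([2.0, 2.0])
for rule in ("linear", "quadratic"):
    print(rule, classify(x_new, means, covs, pooled, rule))
```

When the group covariance matrices are identical, the two rules return the same distances; they diverge only to the extent that the sample matrices differ, which is exactly where unique sampling error can enter.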

Discussion

The distinction between PDA and DDA is fundamental for 
responsible use of discriminant analysis. Typically, 
descriptive discriminant analysis is used after group membership 
is determined, as a post hoc method for describing how predictor 
variables reflect group membership (Huberty & Lowman, 1997). 
Because the response variables are the focus of DDA,
standardized discriminant function coefficients (multiplicative
weights) and structure coefficients should be reported.

For meaningful interpretation of descriptive discriminant 
analysis SAS or SPSS output, the researcher should consult 
MANOVA results and lambda, an $r^2$-type effect size. However, $r^2$-type
effect sizes are "uncorrected for the positive bias in all
variance-accounted-for effect sizes (due to ALL analyses being
correlational and capitalizing on sampling error variance with
the sampled data's total variance that does not exist anywhere
except in this particular sample)" (Thompson, 1997, p. 1). 
Therefore, using a quadratic rule (separate group covariance 
matrices) in DDA, then consulting lambda, compounds the 
influence of unique sample variance. 

In order to interpret predictive discriminant analysis 
results, the adept researcher should consult the hit 
rate/classification rate and linear classification functions 
(LCF). However, when reporting PDA results, only the hit rate
should be reported, because the focus of PDA is the accuracy of
group classification, not how well the predictors explain group
membership. Thus, weights and structure coefficients are not
relevant. In PDA, the most important response variable is the
variable that most hurts hit rate when that variable is not used
in the analysis (Thompson, 1998).
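
One way to operationalize "the variable that most hurts hit rate" is sketched below. The sketch assumes a minimum-distance linear rule with a pooled covariance matrix and estimates the hit rate by leave-one-out classification; the leave-one-out step is an assumption introduced here for illustration, not a procedure described in the sources cited above. The function names and data are hypothetical.

```python
import numpy as np

def loo_hit_rate(X, y):
    """Leave-one-out hit rate for a minimum-distance linear rule (pooled covariance)."""
    n = len(y)
    hits = 0
    for i in range(n):
        keep = np.arange(n) != i
        Xt, yt = X[keep], y[keep]
        labels = np.unique(yt)
        means = [Xt[yt == g].mean(axis=0) for g in labels]
        # Pooled within-group covariance from group-centered training scores.
        centered = np.vstack([Xt[yt == g] - m for g, m in zip(labels, means)])
        pooled = centered.T @ centered / (len(yt) - len(labels))
        dists = [(X[i] - m) @ np.linalg.solve(pooled, X[i] - m) for m in means]
        hits += int(labels[int(np.argmin(dists))] == y[i])
    return hits / n

def hit_rate_drop_by_variable(X, y):
    """Loss in hit rate when each variable (column) is removed in turn."""
    full = loo_hit_rate(X, y)
    return {j: full - loo_hit_rate(np.delete(X, j, axis=1), y)
            for j in range(X.shape[1])}

# Simulated data (assumption): only the first variable separates the groups.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = np.column_stack([10.0 * y + rng.normal(size=40), rng.normal(size=40)])
print(loo_hit_rate(X, y))
print(hit_rate_drop_by_variable(X, y))
```

The variable with the largest drop would be judged most important for prediction in this sense, regardless of its weight or structure coefficient.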

Both PDA and DDA involve a decision to retain separate 
group covariance matrices, the quadratic rule, or to pool group 
covariance matrices, the linear rule. This decision is 
analogous to assessing homogeneity of variance in analysis of 
variance. The quadratic rule typically produces a higher hit 
rate in predictive discriminant analysis and a lower lambda in 
descriptive discriminant analysis. However, the quadratic rule 
is influenced by unique sample variance, so the
generalizability of quadratic rule results is suspect.

Consequently, invoking the linear rule is generally
preferred.